Simultaneous selection of variables and smoothing parameters in structured additive regression models
Abstract
In recent years, considerable research has been devoted to developing complex regression models that can deal simultaneously with nonlinear covariate effects and time trends, unit- or cluster-specific heterogeneity, spatial heterogeneity and complex interactions between covariates of different types. Much less effort, however, has been devoted to model and variable selection. The paper develops a methodology for the simultaneous selection of variables and the degree of smoothness in regression models with a structured additive predictor. These models are quite general, containing additive (mixed) models, geoadditive models and varying coefficient models as special cases. This approach allows one to decide whether a particular covariate enters the model linearly or nonlinearly or is removed from the model. Moreover, it is possible to decide whether a spatial or cluster-specific effect should be incorporated into the model to cope with spatial or cluster-specific heterogeneity. Particular emphasis is also placed on selecting complex interactions between covariates and effects of different types. A new penalty for two-dimensional smoothing is proposed that allows for ANOVA-type decompositions into main effects and an interaction effect without explicitly specifying the main effects. The penalty is an additive combination of other penalties. Fast algorithms and software are developed that allow one to handle even situations with many covariate effects and observations. The algorithms are related to backfitting and Markov chain Monte Carlo techniques, which split the problem into smaller pieces in a divide-and-conquer strategy. Confidence intervals taking model uncertainty into account are based on the bootstrap in combination with MCMC techniques. © 2008 Elsevier B.V. All rights reserved.
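The three-way decision described in the abstract (drop a covariate, keep it as a linear term, or keep it as a smooth term with a chosen amount of smoothing) can be illustrated with a small sketch. This is not the paper's algorithm or software. It assumes a single covariate, a truncated-line spline basis with a ridge penalty as a stand-in for the paper's penalties, and AIC as the selection criterion; the names `design`, `fit_aic` and `select_term` are hypothetical.

```python
import numpy as np


def design(x, kind, n_knots=20):
    """Return (design matrix, 0/1 vector marking the penalized columns)."""
    ones = np.ones_like(x)
    if kind == "excluded":            # intercept only: covariate removed
        return np.column_stack([ones]), np.zeros(1)
    if kind == "linear":              # covariate enters linearly
        return np.column_stack([ones, x]), np.zeros(2)
    # "smooth": intercept, linear part, and truncated lines (x - knot)_+
    knots = np.quantile(x, np.linspace(0, 1, n_knots + 2)[1:-1])
    trunc = np.maximum(x[:, None] - knots[None, :], 0.0)
    X = np.column_stack([ones, x, trunc])
    pen = np.concatenate([np.zeros(2), np.ones(n_knots)])
    return X, pen


def fit_aic(X, pen, y, lam):
    """Penalized least squares fit; returns (AIC, effective degrees of freedom)."""
    n = len(y)
    A = X.T @ X + lam * np.diag(pen)  # only the truncated-line coefficients are shrunk
    B = np.linalg.solve(A, X.T)       # B = A^{-1} X^T, so the hat matrix is X @ B
    beta = B @ y
    df = np.sum(X * B.T)              # trace of the hat matrix
    rss = np.sum((y - X @ beta) ** 2)
    return n * np.log(rss / n) + 2.0 * df, df


def select_term(x, y, lambdas=(1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0, 1000.0)):
    """Pick 'excluded', 'linear' or 'smooth' (with its lambda) by minimal AIC."""
    candidates = [("excluded", 0.0), ("linear", 0.0)]
    candidates += [("smooth", lam) for lam in lambdas]
    best = None
    for kind, lam in candidates:
        X, pen = design(x, kind)
        aic, df = fit_aic(X, pen, y, lam)
        if best is None or aic < best["aic"]:
            best = {"kind": kind, "lam": lam, "aic": aic, "df": df}
    return best


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(0.0, 1.0, 200)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.1, 200)
    print(select_term(x, y))
```

In a model with several covariates, a backfitting-style scheme would repeat this kind of per-term choice on partial residuals, which is the divide-and-conquer idea the abstract refers to; that outer loop is omitted here.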
Related resources
Approximate Bayesian Inference for Latent Gaussian Models Using Integrated Nested Laplace Approximations
Structured additive regression models are perhaps the most commonly used class of models in statistical applications. It includes, among others, (generalised) linear models, (generalised) additive models, smoothing-spline models, state-space models, semiparametric regression, spatial and spatio-temporal models, log-Gaussian Cox-processes, and geostatistical models. In this paper we consider app...
Generalized additive modelling with implicit variable selection by likelihood based boosting
The use of generalized additive models in statistical data analysis suffers from the restriction to few explanatory variables and the problems of selection of smoothing parameters. Generalized additive model boosting circumvents these problems by means of stagewise fitting of weak learners. A fitting procedure is derived which works for all simple exponential family distributions, including bin...
Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations
Structured additive regression models are perhaps the most commonly used class of models in statistical applications. It includes, among others, (generalized) linear models, (generalized) additive models, smoothing spline models, state space models, semiparametric regression, spatial and spatiotemporal models, log-Gaussian Cox processes and geostatistical and geoadditive models. We consider app...
Dimension reduction and parameter estimation for additive index models
In this paper, we consider simultaneous model selection and estimation for the additive index model. The additive index model is a class of structured nonparametric models that can be expressed as additive models of a set of unknown linear transformation of the original predictor variables. We introduce a penalized least squares estimator and discuss how it can be efficiently computed in practi...
Semiparametric regression models with additive nonparametric components and high dimensional parametric components
This paper concerns semiparametric regression models with additive nonparametric components and high dimensional parametric components under sparsity assumptions. To achieve simultaneous model selection for both nonparametric and parametric parts, we introduce a penalty that combines the adaptive empirical L2-norms of the nonparametric component functions and the SCAD penalty on the coefficient...
Journal title:
- Computational Statistics & Data Analysis
Volume 53
Publication date: 2008